Reviewer Integration and Performance Measurement for Malware Detection

نویسندگان

  • Brad Miller
  • Alex Kantchelian
  • Michael Carl Tschantz
  • Sadia Afroz
  • Rekha Bachwani
  • Riyaz Faizullabhoy
  • Ling Huang
  • Vaishaal Shankar
  • Tony Wu
  • George Yiu
  • Anthony D. Joseph
  • J. Doug Tygar
چکیده

We present and evaluate a large-scale malware detection system integrating machine learning with expert reviewers, treating reviewers as a limited labeling resource. We demonstrate that even in small numbers, reviewers can vastly improve the system’s ability to keep pace with evolving threats. We conduct our evaluation on a sample of VirusTotal submissions spanning 2.5 years and containing 1.1 million binaries with 778GB of raw feature data. Without reviewer assistance, we achieve 72% detection at a 0.5% false positive rate, performing comparable to the best vendors on VirusTotal. Given a budget of 80 accurate reviews daily, we improve detection to 89% and are able to detect 42% of malicious binaries undetected upon initial submission to VirusTotal. Additionally, we identify a previously unnoticed temporal inconsistency in the labeling of training datasets. We compare the impact of training labels obtained at the same time training data is first seen with training labels obtained months later. We find that using training labels obtained well after samples appear, and thus unavailable in practice for current training data, inflates measured detection by almost 20 percentage points. We release our cluster-based implementation, as well as a list of all hashes in our evaluation and 3% of our entire dataset.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

DyVSoR: dynamic malware detection based on extracting patterns from value sets of registers

To control the exponential growth of malware files, security analysts pursue dynamic approaches that automatically identify and analyze malicious software samples. Obfuscation and polymorphism employed by malwares make it difficult for signature-based systems to detect sophisticated malware files. The dynamic analysis or run-time behavior provides a better technique to identify the threat. In t...

متن کامل

Malware Detection using Classification of Variable-Length Sequences

In this paper, a novel method based on the graph is proposed to classify the sequence of variable length as feature extraction. The proposed method overcomes the problems of the traditional graph with variable length of data, without fixing length of sequences, by determining the most frequent instructions and insertion the rest of instructions on the set of “other”, save speed and memory. Acco...

متن کامل

Practical Experiences with Purenet, a Self-Learning Malware Prevention System

This paper introduces Purenet, which is a self-learning malware detection system aimed at avoiding zero-day attacks and other delays in patching application systems when attacks are identified. The concept and architecture of Purenet are described, specifically positioning anomaly detection as the system enabler. Deployment of the system in an operational environment is discussed, and associate...

متن کامل

Performance measurement of detection and governmental punitive agency by integrated approach based on DEMATEL, ANP, and DEA

The detection and governmental punitive agency is responsible for supervision on correct implementation of the business laws in Iran. Several criteria are involved in performance assessment of detection and governmental punitive agency. The interaction between criteria and sub-criteria may occur in real situations. Moreover, the performance measurement should be accomplished in a multi-period h...

متن کامل

Probabilistic and Considerate Attestation of IoT Devices against Roving Malware

Remote Attestation (RA) is a popular means of detecting malware presence (or verifying its absence) on embedded and IoT devices. It is especially relevant to low-end devices that are incapable of protecting themselves against infection. Malware that is aware of ongoing or impending attestation and aims to avoid detection can relocate itself during computation of the attestation measurement. In ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016